1 Description

This is a compilation of five homework assignments completed in Spring 2022 for ESS 580: Ecological Data Science.


<!--chapter:end:index.Rmd-->



```r
library(tidyverse)
library(dataRetrieval)
library(dygraphs)
library(xts)
library(yaml)

2 1: Discharge Example

2.1 Methods

The Poudre River at Lincoln Bridge is:

  • Downstream of only a little bit of urban stormwater

  • Near Odell Brewing CO

  • Near an open space area and the Poudre River Trail

  • Downstream of many agricultural diversions

2.2 Site Description

2.3 Data Acquisition and Plotting tests

2.4 Data Download

q <- readNWISdv(siteNumbers = '06752260',
                parameterCd = '00060',
                startDate = '2017-01-01',
                endDate = '2022-01-01') %>%
  rename(q = 'X_00060_00003')

2.5 Static Data Plotter

ggplot(q, aes(x = Date, y = q)) + 
  geom_line() + 
  ylab('Q (cfs)') + 
  ggtitle('Discharge in the Poudre River, Fort Collins')

2.6 Interactive Data Plotter

q_xts <- xts(q$q, order.by = q$Date)

2.7 DyGraph example.

q_xts <- xts(q$q, order.by = q$Date)


dygraph(q_xts) %>%
  dyAxis("y", label = "Discharge (cfs)") %>% dyEvent("2017-5-27", "27 May", labelLoc = "top") %>% dyEvent("2018-5-28", "28 May", labelLoc = "top")%>% dyEvent("2019-6-09", "09 Jun", labelLoc = "top")%>% dyEvent("2020-6-01", "01 Jun", labelLoc = "top")%>% dyEvent("2021-5-23", "23 May", labelLoc = "top")
q_xts <- xts(q$q, order.by = q$Date)
dygraph(q_xts) %>%dyOptions(drawPoints = TRUE, pointSize = 2)

2.8 Poudre Paragraph

The Cache la Poudre, or “Poudre” River is located in the Front Range of Colorado’s Rocky Mountains. The name comes from French trappers who hid their gunpowder near the river. The Arapahoe name for the river is ho’oowu’ heetou’ , which means ‘where a house is located,’ according to CU Boulder’s Center for the Study of Indigenous Languages of the West. The Poudre River is a tributarty of the South Platte, which later joins the Platte River, which is a tributary of the Missouri River, which in turn feeds the Mississippi River and flows into the Gulf of Mexico. According to the Coalition for the Poudre River Watershed, the Poudre supports water supply for over 330,000 residents and 151,547 acres of irrigated land.

3 3: Webscraping and Iterations

3.1 Assignment:

  1. Extract the meteorological data URLs. Here we want you to use the rvest package to get the URLs for the SASP forcing and SBSP_forcing meteorological datasets.c
datapath = 'data/'
dir.create(datapath)
## Warning in dir.create(datapath): 'data' already exists
site_url <- 'https://snowstudies.org/archived-data/'

#Read the web url
webpage <- read_html(site_url)

links <- webpage %>%
  html_nodes('a') %>%
  .[grepl('forcing',.)] %>%
  html_attr('href')

#Grab only the name of the file by splitting out on forward slashes
splits <- str_split_fixed(links,'/',8)

#Keep only the 8th column
dataset <- splits[,8] 

# view(dataset)

file_names <- paste0(datapath,dataset)
  1. Download the meteorological data. Use the download_file and str_split_fixed commands to download the data and save it in your data folder. You can use a for loop or a map function.
#generate a file list for where the data goes
file_names <- paste0('data/',dataset)

for(i in 1:2){
  download.file(links[i],destfile=file_names[i])
}

downloaded <- file.exists(file_names)

evaluate <- !all(downloaded)
  1. Write a custom function to read in the data and append a site column to the data.
# this code grabs the variable names from the metadata pdf file
library(pdftools)
headers <- pdf_text('https://snowstudies.org/wp-content/uploads/2022/02/Serially-Complete-Metadata-text08.pdf') %>%
  readr::read_lines(.) %>%
  trimws(.) %>%
  str_split_fixed(.,'\\.',2) %>%
  .[,2] %>%
  .[1:26] %>%
  str_trim(side = "left")
read_in_weatherdata <- function(file){
name = str_split_fixed(file,'_',3)[,2] 
   df <- read.delim(file, header = F, sep = "", skip =4) %>%  mutate(site = name)
return(df)
}
  1. Use the map function to read in both meteorological files. Display a summary of your tibble.
setwd("~/Spring 2022/ESS580/3_snow_functions_iteration/data")
weather_data_full <- map_dfr(dataset, read_in_weatherdata) 

weather_data_full <- select(weather_data_full, V1, V2, V10, site)%>% rename(Year = V1, Month = V2, temp = V10)

summary(weather_data_full)
##       Year          Month             temp           site          
##  Min.   :2003   Min.   : 1.000   Min.   :242.1   Length:138336     
##  1st Qu.:2005   1st Qu.: 3.000   1st Qu.:265.8   Class :character  
##  Median :2007   Median : 6.000   Median :272.6   Mode  :character  
##  Mean   :2007   Mean   : 6.472   Mean   :272.6                     
##  3rd Qu.:2009   3rd Qu.: 9.000   3rd Qu.:279.7                     
##  Max.   :2011   Max.   :12.000   Max.   :295.8
  1. Make a line plot of mean temp by year by site (using the air temp [K] variable). Is there anything suspicious in the plot? Adjust your filtering if needed.
weather_data_yearlymean <- weather_data_full %>% group_by(Year,site) %>% filter(Year != 2003, Year != 2011) %>% summarize(meantemp = mean(temp))
ggplot(weather_data_yearlymean, aes(x = Year, y = meantemp, color = site )) +
  geom_line() +
  labs( x = "Year", y = "Mean Temperature (K)")

The dataset only included parts of 2003 and 2011, which skewed the yearly average for both years.

  1. Write a function that makes line plots of monthly average temperature at each site for a given year. Use a for loop to make these plots for 2005 to 2010. Are monthly average temperatures at the Senator Beck Study Plot ever warmer than the Snow Angel Study Plot? Hint: https://ggplot2.tidyverse.org/reference/print.ggplot.html
# create a function
yearly_plotter <- function(df, year){
  
  monthly_mean <- df  %>% 
    group_by(Year, Month, site) %>% 
    summarize(meantemp = mean(temp)) %>% filter(year == Year) 
  
   
  figure <- ggplot(monthly_mean, aes(x = Month, y = meantemp, color = site )) +
  geom_line() +
  labs( x = "Month", y = "Mean Temperature (K)") 

  
  print(figure)
}
# run the function in a for loop

x <- c(2005, 2006, 2007, 2008, 2009, 2010)

for(year in x){(yearly_plotter(weather_data_full, year))}

4 4: LAGOS Analysis

4.1 Loading in data

4.1.1 First download and then specifically grab the locus (or site lat longs)

# #Lagos download script
#LAGOSNE::lagosne_get(dest_folder = LAGOSNE:::lagos_path())
#Load in lagos
lagos <- lagosne_load()

#Grab the lake centroid info
lake_centers <- lagos$locus

4.1.2 Convert to spatial data

#Look at the column names
#names(lake_centers)

#Look at the structure
#str(lake_centers)

#View the full dataset
#View(lake_centers %>% slice(1:100))

spatial_lakes <- st_as_sf(lake_centers,coords=c('nhd_long','nhd_lat'),
                          crs=4326) %>%
  st_transform(2163)

#Subset for plotting
subset_spatial <- spatial_lakes %>%
  slice(1:100) 

subset_baser <- spatial_lakes[1:100,]

#Dynamic mapviewer
mapview(subset_spatial)

4.1.3 Subset to only Minnesota

states <- us_states()

#Plot all the states to check if they loaded
#mapview(states)
minnesota <- states %>%
  filter(name == 'Minnesota') %>%
  st_transform(2163)

#Subset lakes based on spatial position
minnesota_lakes <- spatial_lakes[minnesota,]

#Plotting the first 1000 lakes
minnesota_lakes %>%
  arrange(-lake_area_ha) %>%
    slice(1:1000) %>%
  mapview(.,zcol = 'lake_area_ha')

4.2 1) Show a map outline of Iowa and Illinois (similar to Minnesota map upstream)

IA_IL <- states %>%
  filter(name == "Iowa" | name == "Illinois") %>%
  st_transform(2163)


mapview(IA_IL)

4.3 2) Subset LAGOS data to these sites, how many sites are in Illinois and Iowa combined? How does this compare to Minnesota?

IA_IL_lakes <- spatial_lakes[IA_IL,]
count(IA_IL_lakes)
## Simple feature collection with 1 feature and 1 field
## Geometry type: MULTIPOINT
## Dimension:     XY
## Bounding box:  xmin: 277299.3 ymin: -824038.9 xmax: 1080254 ymax: -128993.5
## Projected CRS: NAD27 / US National Atlas Equal Area
##       n                       geometry
## 1 16466 MULTIPOINT ((277299.3 -2468...

There are 16466 sites in Iowa and Illinois combined, which is a little more that half the sites in Minnesota alone.

4.4 3) What is the distribution of lake size in Iowa vs. Minnesota?

  • Here I want to see a histogram plot with lake size on x-axis and frequency on y axis (check out geom_histogram)
Q3 <- spatial_lakes %>% filter(state_zoneid == "State_14" | state_zoneid == "State_13") %>% 
  mutate(state = ifelse(state_zoneid == "State_14", paste("Minnesota"), paste("Iowa")))

ggplot(Q3, aes(lake_area_ha))+
  geom_histogram(bins = 4) +
  # scale_x_continuous(breaks = seq(0, 130000, 4))
  facet_wrap(~state) 

 max(Q3$lake_area_ha)
## [1] 123779.8

The majority of lakes in both states are under 25,000 acres. Minnesota has many more lakes.

4.5 4) Make an interactive plot of lakes in Iowa and Illinois and color them by lake area in hectares

IA_IL_lakes %>%
  arrange(-lake_area_ha) %>%
  mapview(.,zcol = 'lake_area_ha')

4.6 5) What other data sources might we use to understand how reservoirs and natural lakes vary in size in these three states?

Lake volume could be a more meaningful metric than lake area for people interested in available water for agriculture or other purposes.

5 5: Weather and Crop Regressions

5.0.1 Load the PRISM daily maximum temperatures

# daily max temperature
# dimensions: counties x days x years
prism <- readMat("prismiowa.mat")

# look at county #1
t_1981_c1 <- prism$tmaxdaily.iowa[1,,1]
t_1981_c1[366]
## [1] NaN
plot(1:366, t_1981_c1, type = "l")

ggplot() +
  geom_line(mapping = aes(x=1:366, y = t_1981_c1)) +
  theme_bw() +
  xlab("day of year") +
  ylab("daily maximum temperature (°C)") +
  ggtitle("Daily Maximum Temperature, Iowa County #1")
## Warning: Removed 1 row(s) containing missing values (geom_path).

# assign dimension names to tmax matrix
dimnames(prism$tmaxdaily.iowa) <- list(prism$COUNTYFP, 1:366, prism$years)

# converted 3d matrix into a data frame
tmaxdf <- as.data.frame.table(prism$tmaxdaily.iowa)

# relabel the columns
colnames(tmaxdf) <- c("countyfp","doy","year","tmax")
tmaxdf <- tibble(tmaxdf)

5.2 Assignment

5.2.1 Question 1a: Extract Winneshiek County corn yields, fit a linear time trend, make a plot. Is there a significant time trend?

winniecorn <- cornyields %>% filter( county_name == "WINNESHIEK") 

ggplot(winniecorn, mapping = aes(x = year, y = yield)) +
  geom_point() +
  theme_bw() +
  labs(x = "year", y = "yield") +
  geom_smooth(method = lm)
## `geom_smooth()` using formula 'y ~ x'

There is a positive trend between year and yield.

5.2.2 Question 1b: Fit a quadratic time trend (i.e., year + year^2) and make a plot. Is there evidence for slowing yield growth?

winniecorn$yearsq <- winniecorn$year^2

lm_cornyield <- lm(yield ~ year + yearsq, winniecorn)
summary(lm_cornyield)
## 
## Call:
## lm(formula = yield ~ year + yearsq, data = winniecorn)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -51.384  -3.115   1.388   9.743  25.324 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)
## (Intercept)  2.583e+04  8.580e+04   0.301    0.765
## year        -2.812e+01  8.576e+01  -0.328    0.745
## yearsq       7.641e-03  2.143e-02   0.357    0.723
## 
## Residual standard error: 17.17 on 38 degrees of freedom
## Multiple R-squared:  0.7559, Adjusted R-squared:  0.7431 
## F-statistic: 58.84 on 2 and 38 DF,  p-value: 2.311e-12
winniecorn$fitted <- lm_cornyield$fitted.values

ggplot(winniecorn) +
  geom_point(mapping = aes(x = year, y = yield)) +
  geom_line(mapping = aes(x = year, y = fitted)) +
  theme_bw() +
  labs(x = "year", y = "yield")

There is not evidence for slowing yield growth.

5.2.3 Question 2 – Time Series: Let’s analyze the relationship between temperature and yields for the Winneshiek County time series. Use data on yield and summer avg Tmax. Is adding year or Tmax^2 to your model helpful? Make a plot and interpret the results.

winniecorn2 <- right_join(winniecorn, winnesummer, by = "year")

ggplot(winniecorn2, mapping = aes(x = meantmax, y = yield)) +
  geom_point() +
  theme_bw() +
  labs(x = "temp", y = "yield") +
  geom_smooth(method = lm)
## `geom_smooth()` using formula 'y ~ x'

lm_q2 <- lm(yield ~ meantmax + year, data = winniecorn2)
summary(lm_q2)
## 
## Call:
## lm(formula = yield ~ meantmax + year, data = winniecorn2)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -53.071  -7.269   2.271   9.935  27.505 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -4791.774    513.812  -9.326 5.10e-11 ***
## meantmax       -3.201      2.308  -1.387    0.174    
## year            2.514      0.253   9.934 1.01e-11 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 17.06 on 35 degrees of freedom
## Multiple R-squared:  0.7463, Adjusted R-squared:  0.7318 
## F-statistic: 51.48 on 2 and 35 DF,  p-value: 3.761e-11
winniecorn2$fitted <- lm_q2$fitted.values

ggplot(winniecorn2) +
  geom_point(mapping = aes(x = meantmax, y = yield)) +
  geom_smooth(mapping = aes(x = meantmax, y = fitted)) +
  theme_bw() +
  labs(x = "temp", y = "yield")
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'

5.2.4 Question 3 – Cross-Section: Analyze the relationship between temperature and yield across all counties in 2018. Is there a relationship? Interpret the results.

In 2018, there is a negative relationship (slope = -4.216) between temperature and yield across counties.

cornyields$countyfp <- as.factor(cornyields$county_ansi)
cornyields2018 <- cornyields %>% filter(year== "2018")

q3 <- tmaxdf %>% filter(year == "2018" & doy >= 152 & doy <= 243) %>% 
  group_by(countyfp) %>% summarize(meantmax = mean(tmax)) %>% 
  left_join(cornyields2018, by = 'countyfp') %>% filter(!is.na(yield))

lm_2018 <- lm(yield ~ meantmax, data = q3)
summary(lm_2018)
## 
## Call:
## lm(formula = yield ~ meantmax, data = q3)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -42.983 -15.041   0.955  16.795  31.999 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  312.466     63.387   4.929 3.69e-06 ***
## meantmax      -4.216      2.241  -1.882   0.0631 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 19.64 on 91 degrees of freedom
## Multiple R-squared:  0.03745,    Adjusted R-squared:  0.02687 
## F-statistic: 3.541 on 1 and 91 DF,  p-value: 0.06308
ggplot(q3, aes(x = meantmax, y = yield)) +
  geom_point() +
  theme_bw() +
  geom_smooth(method = lm)+
  labs(x = "temp", y = "yield", title = "2018")
## `geom_smooth()` using formula 'y ~ x'

5.2.5 Question 4 – Panel: One way to leverage multiple time series is to group all data into what is called a “panel” regression. Convert the county ID code (“countyfp” or “county_ansi”) into factor using as.factor, then include this variable in a regression using all counties’ yield and summer temperature data. How does the significance of your temperature coefficients (Tmax, Tmax^2) change? Make a plot comparing actual and fitted yields and interpret the results of your model.

In this model, year is the only significant predictor of yield (p < 2e-16).

q4 <-  tmaxdf %>% filter( doy >= 152 & doy <= 243) %>% 
  group_by(countyfp) %>% summarize(meantmax = mean(tmax)) %>% 
  left_join(cornyields, by = 'countyfp') %>% filter(!is.na(yield)) %>%
  mutate(tmaxsq = (meantmax)^2)

lm_q4 <- lm(yield ~ meantmax + tmaxsq + year + countyfp, data = q4)
summary(lm_q4)
## 
## Call:
## lm(formula = yield ~ meantmax + tmaxsq + year + countyfp, data = q4)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -101.016   -9.576    3.141   14.498   52.720 
## 
## Coefficients: (2 not defined because of singularities)
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -8.464e+05  1.240e+06  -0.683    0.495    
## meantmax     6.055e+04  8.920e+04   0.679    0.497    
## tmaxsq      -1.088e+03  1.604e+03  -0.679    0.497    
## year         2.292e+00  2.881e-02  79.557   <2e-16 ***
## countyfp147  1.828e+02  2.634e+02   0.694    0.488    
## countyfp13   1.549e+02  2.235e+02   0.693    0.488    
## countyfp99   1.009e+02  1.583e+02   0.638    0.524    
## countyfp195  1.758e+03  2.559e+03   0.687    0.492    
## countyfp67   4.188e+02  6.073e+02   0.690    0.490    
## countyfp107  4.778e+02  7.491e+02   0.638    0.524    
## countyfp15   3.075e+01  5.189e+01   0.593    0.554    
## countyfp57   6.786e+02  1.036e+03   0.655    0.513    
## countyfp127  2.299e+01  2.543e+01   0.904    0.366    
## countyfp123  4.254e+02  6.600e+02   0.645    0.519    
## countyfp81   8.053e+02  1.164e+03   0.692    0.489    
## countyfp129  1.826e+03  2.747e+03   0.665    0.506    
## countyfp1    4.478e+02  7.060e+02   0.634    0.526    
## countyfp175  3.218e+02  5.328e+02   0.604    0.546    
## countyfp19   4.336e+02  6.241e+02   0.695    0.487    
## countyfp37   8.918e+02  1.299e+03   0.687    0.492    
## countyfp109  5.174e+02  7.408e+02   0.699    0.485    
## countyfp165  1.637e+01  3.829e+01   0.427    0.669    
## countyfp51   1.216e+03  1.877e+03   0.648    0.517    
## countyfp69   2.125e+02  3.006e+02   0.707    0.480    
## countyfp137  1.061e+03  1.613e+03   0.658    0.511    
## countyfp9    1.547e+01  4.068e+01   0.380    0.704    
## countyfp87   8.201e+02  1.258e+03   0.652    0.514    
## countyfp5    1.008e+03  1.468e+03   0.687    0.492    
## countyfp189  1.058e+03  1.534e+03   0.690    0.490    
## countyfp53   6.428e+02  1.022e+03   0.629    0.529    
## countyfp121  2.799e+02  4.567e+02   0.613    0.540    
## countyfp191  1.484e+03  2.161e+03   0.687    0.492    
## countyfp71   2.658e+03  3.982e+03   0.668    0.504    
## countyfp135  3.496e+02  5.833e+02   0.599    0.549    
## countyfp23   7.584e+01  1.070e+02   0.709    0.478    
## countyfp179  8.982e+02  1.386e+03   0.648    0.517    
## countyfp11  -1.469e+00  5.520e+00  -0.266    0.790    
## countyfp193  1.853e+02  3.031e+02   0.611    0.541    
## countyfp117  4.851e+02  7.939e+02   0.611    0.541    
## countyfp139  3.628e+02  5.664e+02   0.641    0.522    
## countyfp141  1.165e+02  1.578e+02   0.738    0.460    
## countyfp3    4.113e+02  6.576e+02   0.626    0.532    
## countyfp29   3.968e+02  6.186e+02   0.641    0.521    
## countyfp7    5.440e+02  8.776e+02   0.620    0.535    
## countyfp95   8.389e+01  1.461e+02   0.574    0.566    
## countyfp77   2.395e+02  3.862e+02   0.620    0.535    
## countyfp55   6.896e+02  9.943e+02   0.694    0.488    
## countyfp91   4.360e+02  6.254e+02   0.697    0.486    
## countyfp79  -1.097e+01  1.619e+01  -0.677    0.498    
## countyfp59   5.552e+02  8.106e+02   0.685    0.493    
## countyfp27  -4.446e+00  5.181e+00  -0.858    0.391    
## countyfp187  2.956e+01  3.574e+01   0.827    0.408    
## countyfp41   1.567e+02  2.274e+02   0.689    0.491    
## countyfp45  -4.398e+00  6.467e+00  -0.680    0.496    
## countyfp153  3.799e+02  5.835e+02   0.651    0.515    
## countyfp65   9.228e+02  1.338e+03   0.690    0.490    
## countyfp131  1.558e+03  2.265e+03   0.688    0.492    
## countyfp143  6.385e+02  9.221e+02   0.692    0.489    
## countyfp197  1.530e+02  2.148e+02   0.712    0.476    
## countyfp133  8.215e+02  1.263e+03   0.651    0.515    
## countyfp83   1.003e+01  1.158e+01   0.867    0.386    
## countyfp35  -3.647e-01  6.425e+00  -0.057    0.955    
## countyfp101  1.590e+03  2.414e+03   0.659    0.510    
## countyfp21   1.024e+02  1.462e+02   0.700    0.484    
## countyfp105  9.314e+01  1.352e+02   0.689    0.491    
## countyfp63   8.682e+02  1.259e+03   0.690    0.490    
## countyfp125  4.463e+02  7.007e+02   0.637    0.524    
## countyfp157  1.460e+02  2.359e+02   0.619    0.536    
## countyfp155  6.594e+02  1.005e+03   0.656    0.512    
## countyfp169 -3.053e-01  4.141e+00  -0.074    0.941    
## countyfp145  1.098e+03  1.677e+03   0.655    0.513    
## countyfp167  6.471e+00  1.128e+01   0.574    0.566    
## countyfp159  4.463e+02  7.341e+02   0.608    0.543    
## countyfp89   2.238e+03  3.269e+03   0.684    0.494    
## countyfp115  8.988e+02  1.371e+03   0.656    0.512    
## countyfp173  6.838e+02  1.080e+03   0.633    0.527    
## countyfp49   2.767e+02  4.330e+02   0.639    0.523    
## countyfp151  3.540e+01  4.702e+01   0.753    0.452    
## countyfp39   4.762e+02  7.841e+02   0.607    0.544    
## countyfp17   3.331e+02  4.740e+02   0.703    0.482    
## countyfp97   1.760e+02  2.663e+02   0.661    0.509    
## countyfp85   6.447e+02  9.903e+02   0.651    0.515    
## countyfp149  8.173e+01  1.397e+02   0.585    0.559    
## countyfp185  3.514e+02  5.931e+02   0.593    0.554    
## countyfp47  -1.532e+01  1.472e+01  -1.041    0.298    
## countyfp111  1.596e+03  2.420e+03   0.659    0.510    
## countyfp177  2.462e+03  3.720e+03   0.662    0.508    
## countyfp75   1.175e+02  1.631e+02   0.720    0.471    
## countyfp181  4.382e+02  6.952e+02   0.630    0.529    
## countyfp31  -9.089e+00  1.612e+01  -0.564    0.573    
## countyfp25   1.676e+01  2.374e+01   0.706    0.480    
## countyfp61   7.567e+02  1.091e+03   0.693    0.488    
## countyfp113  2.509e+01  4.042e+01   0.621    0.535    
## countyfp183  7.615e+02  1.161e+03   0.656    0.512    
## countyfp33   7.052e+02  1.025e+03   0.688    0.491    
## countyfp43   6.247e+02  9.014e+02   0.693    0.488    
## countyfp73   1.641e+02  2.562e+02   0.640    0.522    
## countyfp161  3.584e+01  5.037e+01   0.712    0.477    
## countyfp103  7.277e+01  1.332e+02   0.546    0.585    
## countyfp119 -1.223e+01  1.453e+01  -0.842    0.400    
## countyfp163         NA         NA      NA       NA    
## countyfp171         NA         NA      NA       NA    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 21.43 on 3915 degrees of freedom
## Multiple R-squared:  0.6579, Adjusted R-squared:  0.6493 
## F-statistic: 76.05 on 99 and 3915 DF,  p-value: < 2.2e-16

5.2.6 Question 5 – Soybeans: Download NASS data on soybean yields and explore either a time series relationship for a given county, the cross-sectional relationship for a given year, or a panel across all counties and years.

There is a positive relationship between year and soybean yield in Madison County between 1981 and 2021.

madisonsoy<- soyyields %>% filter( county_name == "MADISON") 

ggplot(madisonsoy, mapping = aes(x = year, y = yield)) +
  geom_point() +
  theme_bw() +
  labs(x = "year", y = "yield") +
  geom_smooth(method = lm)
## `geom_smooth()` using formula 'y ~ x'

5.2.7 Bonus: Find a package to make a county map of Iowa displaying some sort of information about yields or weather. Interpret your map.